Impact of Data Size on Data Mining Prediction Accuracy
Abstract
Prediction accuracy deserves careful study, because higher precision makes prediction systems more efficient, especially in critical fields such as medicine and bioinformatics. In this study we investigated three prediction models (Logistic Regression, Neural Network, and Decision Tree) and compared their prediction performance, treating dataset size and the distribution of dependent-variable values in the dataset as factors that might affect model performance. We used robust, highly correlated data from the Palestinian Households Census 2007. The Neural Network outperformed the other two models, and Logistic Regression outperformed the Decision Tree. Neither the distribution of dependent-variable values nor the dataset size had a significant effect on prediction accuracy; instead, the data content was found to be the most influential factor. Based on these results, we conclude that prediction accuracy is a function of data content and characteristics.
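The kind of size experiment described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it uses synthetic data rather than the census data, covers only a hand-written logistic regression (one of the three model families compared), and every name in it (`make_data`, `train_logistic`, `accuracy_by_size`) is hypothetical. It trains on nested subsamples of increasing size and scores each model on the same held-out set.

```python
import math
import random


def make_data(n, rng):
    """Synthetic binary data (an assumption, not the census data):
    two Gaussian features, label drawn from a fixed logistic model."""
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = 1.0 / (1.0 + math.exp(-(1.5 * x1 - 1.0 * x2)))
        rows.append(((x1, x2), 1 if rng.random() < p else 0))
    return rows


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, epochs=200, lr=0.1):
    """Batch gradient descent on the mean log-loss; returns (w1, w2, bias)."""
    w1 = w2 = b = 0.0
    n = len(rows)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in rows:
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


def accuracy(model, rows):
    """Fraction of rows whose thresholded prediction matches the label."""
    w1, w2, b = model
    hits = sum(
        1
        for (x1, x2), y in rows
        if (1 if sigmoid(w1 * x1 + w2 * x2 + b) >= 0.5 else 0) == y
    )
    return hits / len(rows)


def accuracy_by_size(sizes, test_size=2000, seed=0):
    """Train on the first n rows of one shared pool for each n in sizes,
    and evaluate every model on the same held-out test set."""
    rng = random.Random(seed)
    pool = make_data(max(sizes), rng)
    test = make_data(test_size, rng)
    return {n: accuracy(train_logistic(pool[:n]), test) for n in sizes}


if __name__ == "__main__":
    for n, acc in accuracy_by_size([50, 200, 1000, 5000]).items():
        print(f"train size {n:>5}: test accuracy {acc:.3f}")
```

Keeping the test set fixed while only the training size varies means that any change in accuracy can be attributed to the amount of training data rather than to a different evaluation sample. Repeating the loop with other model families, or with resampled class proportions, would extend the sketch toward the comparison the abstract describes.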
Similar resources
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses. It alleviates the concerns of individuals and organizations about the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)
Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems to manage and minimize the inherent risks of the financial sector; thus, the design and improvement of credit scorin...
A data mining approach to employee turnover prediction (case study: Arak automotive parts manufacturing)
Training and adapting employees consumes time and money. Employee turnover can be predicted from organizational and personal historical data in order to reduce probable losses to organizations. Prediction methods are closely tied to human resource management, since they extract patterns from historical data. This article implements knowledge discovery steps on real data of a manufacturing pla...
Evaluation and Prediction of the Impact of Parasite Waves and Cell Phone Use by Pregnant Mothers on the Volume of Amniotic Fluid based on Data Mining Algorithms
Introduction: Nowadays, the effects of radiation and constant use of cell phones have led to some problems. These radiations cause disorders in different systems of human body and even in a growing fetus. The aim of this study was to find the effect of using cell phone and internet by pregnant women on the amount of amniotic fluid. Method: First, a questionnaire was designed and evaluated by o...
Prediction of mortality in patients admitted to intensive care units, A comparison of three data mining techniques: a brief report.
Background: Early outcome prediction of hospitalized patients is critical because the intensivists are constantly striving to improve patients' survival by taking effective medical decisions about ill patients in Intensive Care Units (ICUs). Despite rapid progress in medical treatments and intensive care technology, the analysis of outcomes, including mortality prediction, has been a challenge ...